1.
Database (Oxford) ; 2022, 2022 08 31.
Article in English | MEDLINE | ID: covidwho-2017881

ABSTRACT

The coronavirus disease 2019 (COVID-19) pandemic has been severely impacting global society since December 2019. Related findings, such as those on vaccine and drug development, have been reported in the biomedical literature at a rate of about 10 000 articles on COVID-19 per month. Such rapid growth significantly challenges manual curation and interpretation. For instance, LitCovid is a literature database of COVID-19-related articles in PubMed, which has accumulated more than 200 000 articles with millions of accesses each month by users worldwide. One primary curation task is to assign up to eight topics (e.g. Diagnosis and Treatment) to the articles in LitCovid. The annotated topics have been widely used for navigating the COVID literature, rapidly locating articles of interest and other downstream studies. However, annotating the topics has been the bottleneck of manual curation. Despite the continuing advances in biomedical text-mining methods, few have been dedicated to topic annotations in COVID-19 literature. To close the gap, we organized the BioCreative LitCovid track to call for a community effort to tackle automated topic annotation for COVID-19 literature. The BioCreative LitCovid dataset, consisting of over 30 000 articles with manually reviewed topics, was created for training and testing. It is one of the largest multi-label classification datasets in biomedical scientific literature. Nineteen teams worldwide participated and made 80 submissions in total. Most teams used hybrid systems based on transformers. The highest performing submissions achieved 0.8875, 0.9181 and 0.9394 for macro-F1-score, micro-F1-score and instance-based F1-score, respectively. Notably, these scores are substantially higher (e.g. 12% higher for macro F1-score) than the corresponding scores of the state-of-the-art multi-label classification method. The level of participation and results demonstrate a successful track and help close the gap between dataset curation and method development. 
The dataset is publicly available via https://ftp.ncbi.nlm.nih.gov/pub/lu/LitCovid/biocreative/ for benchmarking and further development. Database URL: https://ftp.ncbi.nlm.nih.gov/pub/lu/LitCovid/biocreative/.


Subject(s)
COVID-19 , COVID-19/epidemiology , Data Mining/methods , Databases, Factual , Humans , PubMed , Publications
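
The three scores named in the abstract above (macro-F1, micro-F1 and instance-based F1) average the same per-cell counts in different ways. The following is a minimal sketch of the usual definitions in plain Python; it is not the track's official evaluation script, and the function names and the toy label matrices are hypothetical illustrations.

```python
def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def counts(true, pred):
    """True-positive, false-positive and false-negative counts for two 0/1 vectors."""
    tp = sum(1 for t, p in zip(true, pred) if t and p)
    fp = sum(1 for t, p in zip(true, pred) if p and not t)
    fn = sum(1 for t, p in zip(true, pred) if t and not p)
    return tp, fp, fn

def macro_f1(y_true, y_pred):
    """Average of per-label F1 (each topic weighs the same)."""
    per_label = [f1(*counts(t, p)) for t, p in zip(zip(*y_true), zip(*y_pred))]
    return sum(per_label) / len(per_label)

def micro_f1(y_true, y_pred):
    """F1 over counts pooled across all labels (frequent topics dominate)."""
    totals = [counts(t, p) for t, p in zip(zip(*y_true), zip(*y_pred))]
    tp, fp, fn = (sum(c[i] for c in totals) for i in range(3))
    return f1(tp, fp, fn)

def instance_f1(y_true, y_pred):
    """Average of per-article F1 (each article weighs the same)."""
    per_row = [f1(*counts(t, p)) for t, p in zip(y_true, y_pred)]
    return sum(per_row) / len(per_row)

# Hypothetical toy data: 3 articles (rows) x 3 topics (columns).
Y_TRUE = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
Y_PRED = [[1, 0, 0], [0, 1, 0], [1, 0, 1]]

if __name__ == "__main__":
    print(macro_f1(Y_TRUE, Y_PRED), micro_f1(Y_TRUE, Y_PRED), instance_f1(Y_TRUE, Y_PRED))
```

Macro-F1 is the most sensitive of the three to rare topics, which may be why it is the lowest of the reported scores.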
2.
Lancet Digit Health ; 4(7): e542-e557, 2022 07.
Article in English | MEDLINE | ID: covidwho-1882680

ABSTRACT

BACKGROUND: Updatable estimates of COVID-19 onset, progression, and trajectories underpin pandemic mitigation efforts. To identify and characterise disease trajectories, we aimed to define and validate ten COVID-19 phenotypes from nationwide linked electronic health records (EHR) using an extensible framework. METHODS: In this cohort study, we used eight linked National Health Service (NHS) datasets for people in England alive on Jan 23, 2020. Data on COVID-19 testing, vaccination, primary and secondary care records, and death registrations were collected until Nov 30, 2021. We defined ten COVID-19 phenotypes reflecting clinically relevant stages of disease severity and encompassing five categories: positive SARS-CoV-2 test, primary care diagnosis, hospital admission, ventilation modality (four phenotypes), and death (three phenotypes). We constructed patient trajectories illustrating transition frequency and duration between phenotypes. Analyses were stratified by pandemic waves and vaccination status. FINDINGS: Among 57 032 174 individuals included in the cohort, 13 990 423 COVID-19 events were identified in 7 244 925 individuals, equating to an infection rate of 12·7% during the study period. Of 7 244 925 individuals, 460 737 (6·4%) were admitted to hospital and 158 020 (2·2%) died. Of 460 737 individuals who were admitted to hospital, 48 847 (10·6%) were admitted to the intensive care unit (ICU), 69 090 (15·0%) received non-invasive ventilation, and 25 928 (5·6%) received invasive ventilation. Among 384 135 patients who were admitted to hospital but did not require ventilation, mortality was higher in wave 1 (23 485 [30·4%] of 77 202 patients) than wave 2 (44 220 [23·1%] of 191 528 patients), but remained unchanged for patients admitted to the ICU. Mortality was highest among patients who received ventilatory support outside of the ICU in wave 1 (2569 [50·7%] of 5063 patients). 
15 486 (9·8%) of 158 020 COVID-19-related deaths occurred within 28 days of the first COVID-19 event without a COVID-19 diagnosis on the death certificate. 10 884 (6·9%) of 158 020 deaths were identified exclusively from mortality data with no previous COVID-19 phenotype recorded. We observed longer patient trajectories in wave 2 than wave 1. INTERPRETATION: Our analyses illustrate the wide spectrum of disease trajectories as shown by differences in incidence, survival, and clinical pathways. We have provided a modular analytical framework that can be used to monitor the impact of the pandemic and generate evidence of clinical and policy relevance using multiple EHR sources. FUNDING: British Heart Foundation Data Science Centre, led by Health Data Research UK.


Subject(s)
COVID-19 , COVID-19/epidemiology , COVID-19 Testing , Cohort Studies , Electronic Health Records , England/epidemiology , Humans , SARS-CoV-2 , State Medicine
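
The headline percentages in the abstract above follow directly from the counts it reports. As a small worked check, the sketch below recomputes a few of them; the helper name `pct` is a hypothetical illustration, and all numbers are copied from the abstract.

```python
def pct(numerator, denominator):
    """Percentage rounded to one decimal place, as reported in the abstract."""
    return round(100 * numerator / denominator, 1)

COHORT = 57_032_174    # individuals included in the cohort
INFECTED = 7_244_925   # individuals with at least one COVID-19 event
ADMITTED = 460_737     # individuals admitted to hospital
DIED = 158_020         # COVID-19-related deaths

if __name__ == "__main__":
    print("infection rate:", pct(INFECTED, COHORT))         # share of the cohort
    print("admitted to hospital:", pct(ADMITTED, INFECTED))  # share of infected
    print("died:", pct(DIED, INFECTED))                      # share of infected
    print("ICU admission:", pct(48_847, ADMITTED))           # share of admitted
```

Note that the denominators change between sentences: the infection rate is relative to the whole cohort, admissions and deaths are relative to infected individuals, and ICU and ventilation figures are relative to those admitted.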
3.
Orphanet J Rare Dis ; 17(1): 166, 2022 04 12.
Article in English | MEDLINE | ID: covidwho-1789126

ABSTRACT

BACKGROUND: Several common conditions have been widely recognised as risk factors for COVID-19-related death, but the risks borne by people with rare diseases are largely unknown. Therefore, we aimed to estimate the difference in risk between people with rare diseases and unaffected people. METHOD: To estimate the correlation between rare diseases and COVID-19-related death, we performed a retrospective cohort study of Genomics England 100k Genomes participants who tested positive for SARS-CoV-2 during the first wave (16 March 2020 until 31 July 2020) of the COVID-19 pandemic in the UK (n = 283). COVID-19-related mortality rates were calculated in two groups: rare disease patients (n = 158) and unaffected relatives (n = 125). Fisher's exact test and logistic regression were used for univariable and multivariable analysis, respectively. RESULTS: People with rare diseases had an increased risk of COVID-19-related death compared with unaffected relatives (OR [95% CI] = 3.47 [1.21-12.2]). However, the effect was not significant after adjusting for age and number of comorbidities (OR [95% CI] = 1.94 [0.65-5.80]). Neurology and neurodevelopmental diseases were significantly associated with COVID-19-related death in both univariable (OR [95% CI] = 4.07 [1.61-10.38]) and multivariable analysis (OR [95% CI] = 4.22 [1.60-11.08]). CONCLUSIONS: Our results showed that rare disease patients in the Genomics England cohort, especially those affected by neurology and neurodevelopmental disorders, had an increased risk of COVID-19-related death during the first wave of the pandemic in the UK. The high risk is likely associated with the rare diseases themselves, although we cannot rule out possible mediators because of the small sample size. We would like to raise awareness that rare disease patients may face an increased risk of COVID-19-related death. Rare disease patients should be properly considered when relevant policies (e.g., returning to the workplace) are made.


Subject(s)
COVID-19 , COVID-19/genetics , Cohort Studies , England , Genomics , Humans , Pandemics , Rare Diseases/epidemiology , Rare Diseases/genetics , Retrospective Studies , SARS-CoV-2
4.
IEEE J Biomed Health Inform ; 26(1): 423-435, 2022 01.
Article in English | MEDLINE | ID: covidwho-1666255

ABSTRACT

The ability to perform accurate prognosis is crucial for proactive clinical decision making, informed resource management and personalised care. Existing outcome prediction models suffer from a low recall of infrequent positive outcomes. We present a highly-scalable and robust machine learning framework to automatically predict adversity represented by mortality and ICU admission and readmission from time-series of vital signs and laboratory results obtained within the first 24 hours of hospital admission. The stacked ensemble platform comprises two components: a) an unsupervised LSTM Autoencoder that learns an optimal representation of the time-series, using it to differentiate the less frequent patterns which conclude with an adverse event from the majority patterns that do not, and b) a gradient boosting model, which relies on the constructed representation to refine prediction by incorporating static features. The model is used to assess a patient's risk of adversity and provides visual justifications of its prediction. Results of three case studies show that the model outperforms existing platforms in ICU and general ward settings, achieving average Precision-Recall Areas Under the Curve (PR-AUCs) of 0.891 (95% CI: 0.878-0.939) for mortality and 0.908 (95% CI: 0.870-0.935) in predicting ICU admission and readmission.


Subject(s)
Electronic Health Records , Machine Learning , Hospitalization , Humans , Length of Stay , ROC Curve , Retrospective Studies
5.
The Lancet ; 398, 2021.
Article in English | ProQuest Central | ID: covidwho-1537175

ABSTRACT

Background The ongoing COVID-19 pandemic has had a high incidence and mortality so far. Several common conditions have been widely recognised as risk factors for COVID-19-related death, but the risks for patients with rare diseases are largely unknown. Therefore, we aimed to estimate the difference in the risk of mortality for patients with rare diseases compared to the risk for the general population. Methods To estimate the correlation between rare diseases and COVID-19-related death, we performed a retrospective cohort study of Genomics England participants who tested positive for SARS-CoV-2 (n=283) during the first wave (March 16 to July 31, 2020) of the COVID-19 pandemic in the UK. Participants with one of 190 rare diseases and their biological relatives (mostly to the first or second degree) were recruited by Genomics England, where patients had a provisional diagnosis but not a molecular diagnosis. COVID-19-related mortality rates were calculated in two groups: patients with rare diseases and unaffected relatives. Univariable analysis on the associations between rare diseases and COVID-19-related death was done with Fisher's exact test. Adjusted odds ratio (OR; for age and number of common comorbidities) was calculated with multivariable logistic regression. The study was approved by Genomics England (reference GEL-79143). Findings There were 20 (13%) COVID-19-related deaths in patients with rare diseases (n=158) and five (4%) COVID-19-related deaths in unaffected relatives (n=125), translating to an increased risk of mortality in patients with rare diseases (OR 3·47 [95% CI 1·21–12·2], Fisher's exact p=0·011). A greater OR was observed in participants younger than 60 years (univariable 5·11 [0·56–245·16]; p=0·212), although the trend was not significant. 
Having a rare disease (multivariable 1·94 [0·65–5·80]; p=0·233) and the number of comorbidities (multivariable 2·10 [0·79–5·58]; p=0·135) contributed similarly to COVID-19-related death in multivariable logistic regression analysis in this cohort. Sex was not found to affect the mortality rate. Interpretation Our results show that patients with rare diseases in the Genomics England cohort had an increased risk of COVID-19-related mortality during the first wave of the pandemic in the UK. The high risk is probably associated with the rare diseases themselves, but we cannot rule out possible mediators due to the small sample size. We would like to raise the awareness that patients with rare diseases might face increased risk for COVID-19-related death. Proper considerations for these patients should be taken when relevant decisions (eg, returning to a workplace) are made. Funding HZ is supported by Wellcome Trust ITPA funding (grant number PIII026/013). HW is supported by Wellcome Trust ITPA (grant number PIII0054/005) and Medical Research Council (grant number MR/S004149/2).
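
The univariable result in the abstract above can be traced from its 2x2 table: 20 deaths among 158 patients with rare diseases and 5 deaths among 125 unaffected relatives. The sketch below computes the sample odds ratio and a two-sided Fisher's exact p-value in plain Python. It is an illustration and not the authors' analysis code; the function names are hypothetical. The sample odds ratio it yields (about 3.48) differs slightly from the reported 3·47, which may reflect a conditional maximum-likelihood estimate; that explanation is an assumption. The confidence interval is not reproduced here.

```python
from math import comb

def odds_ratio(a, b, c, d):
    """Sample odds ratio for the table [[a, b], [c, d]]."""
    return (a * d) / (b * c)

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value: sum the hypergeometric probabilities
    of all tables (with the same margins) that are no more likely than the
    observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))

# Counts from the abstract: deaths and survivors in each group.
RARE_DEATHS, RARE_SURVIVORS = 20, 158 - 20
RELATIVE_DEATHS, RELATIVE_SURVIVORS = 5, 125 - 5

if __name__ == "__main__":
    table = (RARE_DEATHS, RARE_SURVIVORS, RELATIVE_DEATHS, RELATIVE_SURVIVORS)
    print("odds ratio:", odds_ratio(*table))
    print("Fisher two-sided p:", fisher_exact_two_sided(*table))
```

The printed p-value can be compared against the reported p=0·011.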

6.
J Am Med Inform Assoc ; 28(4): 791-800, 2021 03 18.
Article in English | MEDLINE | ID: covidwho-1142659

ABSTRACT

OBJECTIVE: Risk prediction models are widely used to inform evidence-based clinical decision making. However, few models developed from single cohorts can perform consistently well at population level where diverse prognoses exist (such as the SARS-CoV-2 [severe acute respiratory syndrome coronavirus 2] pandemic). This study aims at tackling this challenge by synergizing prediction models from the literature using ensemble learning. MATERIALS AND METHODS: In this study, we selected and reimplemented 7 prediction models for COVID-19 (coronavirus disease 2019) that were derived from diverse cohorts and used different implementation techniques. A novel ensemble learning framework was proposed to synergize them for realizing personalized predictions for individual patients. Four diverse international cohorts (2 from the United Kingdom and 2 from China; N = 5394) were used to validate all 8 models on discrimination, calibration, and clinical usefulness. RESULTS: Results showed that individual prediction models could perform well on some cohorts while poorly on others. Conversely, the ensemble model achieved the best performances consistently on all metrics quantifying discrimination, calibration, and clinical usefulness. Performance disparities were observed in cohorts from the 2 countries: all models achieved better performances on the China cohorts. DISCUSSION: When individual models were learned from complementary cohorts, the synergized model had the potential to achieve better performances than any individual model. Results indicate that blood parameters and physiological measurements might have better predictive powers when collected early, which remains to be confirmed by further studies. CONCLUSIONS: Combining a diverse set of individual prediction models, the ensemble method can synergize a robust and well-performing model by choosing the most competent ones for individual patients.


Subject(s)
COVID-19/mortality , Models, Statistical , Prognosis , Adult , Aged , Aged, 80 and over , COVID-19/epidemiology , COVID-19/prevention & control , China/epidemiology , Female , Humans , Male , Middle Aged , Risk Assessment/methods , SARS-CoV-2 , United Kingdom/epidemiology
7.
J Biomed Inform ; 116: 103728, 2021 04.
Article in English | MEDLINE | ID: covidwho-1131454

ABSTRACT

BACKGROUND: Diagnostic or procedural coding of clinical notes aims to derive a coded summary of disease-related information about patients. Such coding is usually done manually in hospitals but could potentially be automated to improve the efficiency and accuracy of medical coding. Recent studies on deep learning for automated medical coding achieved promising performances. However, the explainability of these models is usually poor, preventing them from being used confidently in supporting clinical practice. Another limitation is that these models mostly assume independence among labels, ignoring the complex correlations among medical codes which can potentially be exploited to improve the performance. METHODS: To address the issues of model explainability and label correlations, we propose a Hierarchical Label-wise Attention Network (HLAN), which aims to interpret the model by quantifying importance (as attention weights) of words and sentences related to each of the labels. Secondly, we propose to enhance the major deep learning models with a label embedding (LE) initialisation approach, which learns a dense, continuous vector representation and then injects the representation into the final layers and the label-wise attention layers in the models. We evaluated the methods using three settings on the MIMIC-III discharge summaries: full codes, top-50 codes, and the UK NHS (National Health Service) COVID-19 (Coronavirus disease 2019) shielding codes. Experiments were conducted to compare the HLAN model and label embedding initialisation to the state-of-the-art neural network based methods, including variants of Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs). RESULTS: HLAN achieved the best Micro-level AUC and F1 on the top-50 code prediction, 91.9% and 64.1%, respectively; and comparable results on the NHS COVID-19 shielding code prediction to other models: around 97% Micro-level AUC. 
More importantly, in the analysis of model explanations, by highlighting the most salient words and sentences for each label, HLAN showed more meaningful and comprehensive model interpretation compared to the CNN-based models and its downgraded baselines, HAN and HA-GRU. Label embedding (LE) initialisation significantly boosted the previous state-of-the-art model, CNN with attention mechanisms, on the full code prediction to 52.5% Micro-level F1. The analysis of the layers initialised with label embeddings further explains the effect of this initialisation approach. The source code of the implementation and the results are openly available at https://github.com/acadTags/Explainable-Automated-Medical-Coding. CONCLUSION: We draw three conclusions from the evaluation results and analyses. First, with hierarchical label-wise attention mechanisms, HLAN can provide better or comparable results for automated coding to the state-of-the-art, CNN-based models. Second, HLAN can provide more comprehensive explanations for each label by highlighting key words and sentences in the discharge summaries, compared to the n-grams in the CNN-based models and the downgraded baselines, HAN and HA-GRU. Third, the performance of deep learning based multi-label classification for automated coding can be consistently boosted by initialising label embeddings that capture the correlations among labels. We further discuss the advantages and drawbacks of the overall method regarding its potential to be deployed to a hospital and suggest areas for future studies.


Subject(s)
COVID-19 , Clinical Coding/methods , Neural Networks, Computer , SARS-CoV-2 , COVID-19/epidemiology , Clinical Coding/statistics & numerical data , Deep Learning , Electronic Health Records/statistics & numerical data , Humans , Medical Informatics , Pandemics/statistics & numerical data , United Kingdom/epidemiology
8.
Eur J Prev Cardiol ; 28(14): 1599-1609, 2021 12 20.
Article in English | MEDLINE | ID: covidwho-1091243

ABSTRACT

AIMS: Cardiovascular diseases (CVDs) increase mortality risk from coronavirus infection (COVID-19). There are also concerns that the pandemic has affected supply and demand of acute cardiovascular care. We estimated excess mortality in specific CVDs, both 'direct', through infection, and 'indirect', through changes in healthcare. METHODS AND RESULTS: We used (i) national mortality data for England and Wales to investigate trends in non-COVID-19 and CVD excess deaths; (ii) routine data from hospitals in England (n = 2), Italy (n = 1), and China (n = 5) to assess indirect pandemic effects on referral, diagnosis, and treatment services for CVD; and (iii) population-based electronic health records from 3 862 012 individuals in England to investigate pre- and post-COVID-19 mortality for people with incident and prevalent CVD. We incorporated pre-COVID-19 risk (by age, sex, and comorbidities), estimated population COVID-19 prevalence, and estimated relative risk (RR) of mortality in those with CVD and COVID-19 compared with CVD and non-infected (RR: 1.2, 1.5, 2.0, and 3.0). Mortality data suggest indirect effects on CVD will be delayed rather than contemporaneous (peak RR 1.14). CVD service activity decreased by 60-100% compared with pre-pandemic levels in eight hospitals across China, Italy, and England. In China, activity remained below pre-COVID-19 levels for 2-3 months even after easing lockdown and is still reduced in Italy and England. For total CVD (incident and prevalent), at 10% COVID-19 prevalence, we estimated direct impact of 31 205 and 62 410 excess deaths in England (RR 1.5 and 2.0, respectively), and indirect effect of 49 932 to 99 865 deaths. CONCLUSION: Supply and demand for CVD services have dramatically reduced across countries with potential for substantial, but avoidable, excess mortality during and after the pandemic.


Subject(s)
COVID-19 , Cardiovascular Diseases , Cardiovascular Diseases/diagnosis , Cardiovascular Diseases/epidemiology , Communicable Disease Control , Humans , Pandemics , SARS-CoV-2
9.
BMC Med ; 19(1): 23, 2021 01 21.
Article in English | MEDLINE | ID: covidwho-1067228

ABSTRACT

BACKGROUND: The National Early Warning Score (NEWS2) is currently recommended in the UK for the risk stratification of COVID-19 patients, but little is known about its ability to detect severe cases. We aimed to evaluate NEWS2 for the prediction of severe COVID-19 outcome and identify and validate a set of blood and physiological parameters routinely collected at hospital admission to improve upon the use of NEWS2 alone for medium-term risk stratification. METHODS: Training cohorts comprised 1276 patients admitted to King's College Hospital National Health Service (NHS) Foundation Trust with COVID-19 disease from 1 March to 30 April 2020. External validation cohorts included 6237 patients from five UK NHS Trusts (Guy's and St Thomas' Hospitals, University Hospitals Southampton, University Hospitals Bristol and Weston NHS Foundation Trust, University College London Hospitals, University Hospitals Birmingham), one hospital in Norway (Oslo University Hospital), and two hospitals in Wuhan, China (Wuhan Sixth Hospital and Taikang Tongji Hospital). The outcome was severe COVID-19 disease (transfer to intensive care unit (ICU) or death) at 14 days after hospital admission. Age, physiological measures, blood biomarkers, sex, ethnicity, and comorbidities (hypertension, diabetes, cardiovascular, respiratory and kidney diseases) measured at hospital admission were considered in the models. RESULTS: A baseline model of 'NEWS2 + age' had poor-to-moderate discrimination for severe COVID-19 infection at 14 days (area under receiver operating characteristic curve (AUC) in training cohort = 0.700, 95% confidence interval (CI) 0.680, 0.722; Brier score = 0.192, 95% CI 0.186, 0.197). 
A supplemented model adding eight routinely collected blood and physiological parameters (supplemental oxygen flow rate, urea, age, oxygen saturation, C-reactive protein, estimated glomerular filtration rate, neutrophil count, neutrophil/lymphocyte ratio) improved discrimination (AUC = 0.735; 95% CI 0.715, 0.757), and these improvements were replicated across seven UK and non-UK sites. However, there was evidence of miscalibration with the model tending to underestimate risks in most sites. CONCLUSIONS: NEWS2 score had poor-to-moderate discrimination for medium-term COVID-19 outcome which raises questions about its use as a screening tool at hospital admission. Risk stratification was improved by including readily available blood and physiological parameters measured at hospital admission, but there was evidence of miscalibration in external sites. This highlights the need for a better understanding of the use of early warning scores for COVID.


Subject(s)
COVID-19/diagnosis , Early Warning Score , Aged , COVID-19/epidemiology , COVID-19/virology , Cohort Studies , Electronic Health Records , Female , Humans , Male , Middle Aged , Pandemics , Prognosis , SARS-CoV-2/isolation & purification , State Medicine , United Kingdom/epidemiology
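
The abstract above reports discrimination as the area under the receiver operating characteristic curve (AUC) and overall accuracy of predicted risks as the Brier score. The sketch below shows the standard definitions of both in plain Python; it is not the study's code, the function names are hypothetical, and the four-patient data set is made up purely for illustration.

```python
def brier_score(outcomes, risks):
    """Mean squared difference between predicted risk and the 0/1 outcome
    (0 is perfect; lower is better)."""
    return sum((r - y) ** 2 for y, r in zip(outcomes, risks)) / len(outcomes)

def auc(outcomes, risks):
    """Probability that a randomly chosen case is ranked above a randomly
    chosen non-case, counting ties as one half (0.5 is chance, 1.0 is perfect)."""
    cases = [r for y, r in zip(outcomes, risks) if y == 1]
    controls = [r for y, r in zip(outcomes, risks) if y == 0]
    if not cases or not controls:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0
               for c in cases for k in controls)
    return wins / (len(cases) * len(controls))

# Hypothetical toy data: 1 = severe outcome at 14 days, 0 = not severe.
OUTCOMES = [0, 0, 1, 1]
RISKS = [0.1, 0.4, 0.35, 0.8]

if __name__ == "__main__":
    print("AUC:", auc(OUTCOMES, RISKS))
    print("Brier score:", brier_score(OUTCOMES, RISKS))
```

AUC depends only on how patients are ranked, so a model can discriminate reasonably well and still be miscalibrated, which is consistent with the underestimation of risk described in the abstract.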
10.
Engineering (Beijing) ; 8: 116-121, 2022 Jan.
Article in English | MEDLINE | ID: covidwho-947208

ABSTRACT

Coronavirus disease 2019 (COVID-19) has become a worldwide pandemic. Patients hospitalized with COVID-19 suffer from a high mortality rate, motivating the development of convenient and practical methods that allow clinicians to promptly identify high-risk patients. Here, we have developed a risk score using clinical data from 1479 inpatients admitted to Tongji Hospital, Wuhan, China (development cohort) and externally validated it with data from two other centers: 141 inpatients from Jinyintan Hospital, Wuhan, China (validation cohort 1) and 432 inpatients from The Third People's Hospital of Shenzhen, Shenzhen, China (validation cohort 2). The risk score is based on three biomarkers that are readily available in routine blood samples and can easily be translated into a probability of death. The risk score can predict the mortality of individual patients more than 12 days in advance with more than 90% accuracy across all cohorts. Moreover, the Kaplan-Meier score shows that patients can be clearly differentiated upon admission as low, intermediate, or high risk, with an area under the curve (AUC) score of 0.9551. In summary, a simple risk score has been validated to predict death in patients infected with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2); it has also been validated in independent cohorts.
